Evaluating reliability in wearable devices for sleep staging

Sleep is crucial for physical and mental health, but traditional sleep quality assessment methods have limitations. This scoping review analyzes 35 articles from the past decade, evaluating 62 wearable setups with varying sensors, algorithms, and features. Our analysis indicates a trend towards combining accelerometer and photoplethysmography (PPG) data for out-of-lab sleep staging. Devices using only accelerometer data are effective for sleep/wake detection but fall short in identifying multiple sleep stages, unlike those incorporating PPG signals. To enhance the reliability of sleep staging wearables, we propose five recommendations: (1) Algorithm validation with equity, diversity, and inclusion considerations, (2) Comparative performance analysis of commercial algorithms across multiple sleep stages, (3) Exploration of feature impacts on algorithm accuracy, (4) Consistent reporting of performance metrics for objective reliability assessment, and (5) Encouragement of open-source classifier and data availability. Implementing these recommendations can improve the accuracy and reliability of sleep staging algorithms in wearables, solidifying their value in research and clinical settings.
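As an illustration of recommendation (4), the following is a minimal sketch of how epoch-by-epoch agreement between a wearable and a reference hypnogram could be reported in a consistent way. The function name `agreement_metrics`, the label strings, and the choice of metrics (accuracy, Cohen's kappa, per-stage sensitivity and specificity) are assumptions made for this example; they are not taken from any of the reviewed studies.

```python
from collections import Counter


def agreement_metrics(reference, predicted):
    """Epoch-by-epoch agreement between reference and device stage labels.

    Hypothetical helper for illustration only: both inputs are equal-length
    sequences of stage labels, one label per scored epoch.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(reference)
    stages = sorted(set(reference) | set(predicted))

    # Overall accuracy: fraction of epochs where both labels match.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement comes from the marginal label frequencies.
    ref_counts, pred_counts = Counter(reference), Counter(predicted)
    expected = sum(ref_counts[s] * pred_counts[s] for s in stages) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")

    # One-vs-rest sensitivity and specificity for each stage.
    per_stage = {}
    for s in stages:
        tp = sum(r == s and p == s for r, p in zip(reference, predicted))
        fn = sum(r == s and p != s for r, p in zip(reference, predicted))
        fp = sum(r != s and p == s for r, p in zip(reference, predicted))
        tn = n - tp - fn - fp
        per_stage[s] = {
            "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
            "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        }
    return {"accuracy": observed, "kappa": kappa, "per_stage": per_stage}
```

Reporting kappa and per-stage values alongside accuracy matters because sleep stages are unevenly distributed across a night, so accuracy alone can look high even when a less frequent stage is rarely detected.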


Rationale
3 Describe the rationale for the review in the context of existing knowledge. Location: p.1

Objectives
4 Provide an explicit statement of the objective(s) or question(s) the review addresses. Location: p.1

METHODS
Eligibility criteria
5 Specify the inclusion and exclusion criteria for the review and how studies were grouped for the syntheses. Location: p.1/p.2

Information sources
6 Specify all databases, registers, websites, organisations, reference lists and other sources searched or consulted to identify studies. Specify the date when each source was last searched or consulted. Location: p.1/p.2

Search strategy
7 Present the full search strategies for all databases, registers and websites, including any filters and limits used. Location: p.1/p.2

Selection process
8 Specify the methods used to decide whether a study met the inclusion criteria of the review, including how many reviewers screened each record and each report retrieved, whether they worked independently, and if applicable, details of automation tools used in the process. Location: p.1/p.2

Data collection process
9 Specify the methods used to collect data from reports, including how many reviewers collected data from each report, whether they worked independently, any processes for obtaining or confirming data from study investigators, and if applicable, details of automation tools used in the process. Location: p.1/p.2

Data items
10a List and define all outcomes for which data were sought. Specify whether all results that were compatible with each outcome domain in each study were sought (e.g. for all measures, time points, analyses), and if not, the methods used to decide which results to collect. Location: N/A
10b List and define all other variables for which data were sought (e.g. participant and intervention characteristics, funding sources). Describe any assumptions made about any missing or unclear information. Location: N/A

Study risk of bias assessment
11 Specify the methods used to assess risk of bias in the included studies, including details of the tool(s) used, how many reviewers assessed each study and whether they worked independently, and if applicable, details of automation tools used in the process. Location: N/A

Effect measures
12 Specify for each outcome the effect measure(s) (e.g. risk ratio, mean difference) used in the synthesis or presentation of results. Location: N/A

Synthesis methods
13a Describe the processes used to decide which studies were eligible for each synthesis (e.g. tabulating the study intervention characteristics and comparing against the planned groups for each synthesis (item #5)). Location: p.2
13b Describe any methods required to prepare the data for presentation or synthesis, such as handling of missing summary statistics, or data conversions. Location: N/A
13c Describe any methods used to tabulate or visually display results of individual studies and syntheses. Location: N/A
13d Describe any methods used to synthesize results and provide a rationale for the choice(s). If meta-analysis was performed, describe the model(s), method(s) to identify the presence and extent of statistical heterogeneity, and software package(s) used.
13e Describe any methods used to explore possible causes of heterogeneity among study results (e.g. subgroup analysis, meta-regression). Location: N/A
13f Describe any sensitivity analyses conducted to assess robustness of the synthesized results. Location: N/A

Reporting bias assessment
14 Describe any methods used to assess risk of bias due to missing results in a synthesis (arising from reporting biases). Location: N/A

Certainty assessment
15 Describe any methods used to assess certainty (or confidence) in the body of evidence for an outcome.

RESULTS
Study selection
16a Describe the results of the search and selection process, from the number of records identified in the search to the number of studies included in the review, ideally using a flow diagram. Location: p.2
16b Cite studies that might appear to meet the inclusion criteria, but which were excluded, and explain why they were excluded.

Results of individual studies
19 For all outcomes, present, for each study: (a) summary statistics for each group (where appropriate) and (b) an effect estimate and its precision (e.g. confidence/credible interval), ideally using structured tables or plots. Location: N/A

Results of syntheses
20a For each synthesis, briefly summarise the characteristics and risk of bias among contributing studies. Location: N/A
20b Present results of all statistical syntheses conducted. If meta-analysis was done, present for each the summary estimate and its precision (e.g. confidence/credible interval) and measures of statistical heterogeneity. If comparing groups, describe the direction of the effect. Location: N/A
20c Present results of all investigations of possible causes of heterogeneity among study results. Location: N/A
20d Present results of all sensitivity analyses conducted to assess the robustness of the synthesized results. Location: N/A

Reporting biases
21 Present assessments of risk of bias due to missing results (arising from reporting biases) for each synthesis assessed.

Certainty of evidence
22 Present assessments of certainty (or confidence) in the body of evidence for each outcome assessed. Location: N/A

DISCUSSION
Discussion
23a Provide a general interpretation of the results in the context of other evidence. Location: p.7
23b Discuss any limitations of the evidence included in the review. Location: p.7
23c Discuss any limitations of the review processes used. Location: p.7
23d Discuss implications of the results for practice, policy, and future research. Location: p.7/p.8